PATTERN-MATCHING RULES FOR THE RECOGNITION OF
NATURAL LANGUAGE DIALOGUE EXPRESSIONS

Kenneth Mark Colby
and
Roger C. Parkison


INTRODUCTION

     To recognize something is to identify it as an instance of the
"same again". This familiarity is possible because the world has
recurrent characteristics. We shall describe an algorithm which
recognizes recurrent characteristics of natural language dialogue
expressions. It utilizes a multi-stage sequence of pattern-matching
rules for progressively transforming an input expression until it
eventually matches an abstract stored pattern. The stored pattern
has a pointer to a response function in memory which decides what to
do once the input has been recognized. Here we discuss only the
recognizing functions, except for one response function (anaphoric
substitution) which interactively aids the recognition process.
Details of how the response functions operate will be described in a
future communication by William Faught and ourselves.
     We are constructing and testing a simulation of paranoid
thought processes; our problem is to reproduce paranoid linguistic
behavior in a teletyped diagnostic psychiatric interview. The
diagnosis of paranoid states, reactions, or modes is made by
clinicians who judge the degree of correspondence between what they
observe in an interview and their conceptual model of paranoid
behavior. There exists a high degree of agreement among
psychiatrists about this conceptual model, which relies mainly on
what an interviewee says and how he says it.
     Natural language is a life-expressing code which people use for
communication with themselves and others. In a real-life dialogue
such as a psychiatric interview, the participants have interests,
intentions, and expectations which are revealed in their linguistic
expressions. An interactive simulation of a paranoid patient must be
able to demonstrate typical paranoid linguistic behavior. To achieve
this effect, our paranoid model must have the ability to deal with
the teletyped messages of an interviewer.
     A number of approaches have been taken to dealing with natural
language dialogue expressions (Winograd, 1972; Woods, 1970). These
approaches rely on parsers which conduct a detailed syntactic and
semantic analysis. They perform well for the purposes for which they
were designed. Their weakness, for our purposes, lies in their lack
of mechanisms for neglecting and ignoring input. Such mechanisms are
necessary in a program which accepts and responds to unrestricted
conversational English characterized by expressions novel to the
program.
     How humans process natural language is largely unknown. They
possess some knowledge of grammatical rules, but this fact does not
entail that they use a grammar in interpreting and producing
language. It seems implausible to us that people possess full
transformational grammars for processing language. Language is what
is recognized, but the processes involved need not be linguistic or
grammatical. Transformational grammars were not originally designed
to "understand" a large subset of English; they constituted a formal
method for deciding whether a string is grammatical.
     An analysis of what one's problem actually is should guide the
selection or invention of methods appropriate to its solution. Our
problem is not to develop a consistent and general theory of
language, nor to assert empirically testable hypotheses about how
people process language. Our problem is to design an algorithm which
recognizes what is being said in a dialogue and what is being said
about it, in order to make a response such that a sample of I-O
pairs from the paranoid model is judged similar to a sample of I-O
pairs from paranoid patients. The design task belongs to artificial
intelligence, in which the criterion is how adequately the computer
program performs mind-like functions. New methods had to be devised
for an algorithm to participate in a human dialogue in a
paranoid-patient-like way. We sought effective methods which could
operate efficiently in real time. Since our method provides a
general many-to-one mapping from surface expressions to a single
stored pattern, it is not limited to the simulation of paranoia but
can be used by any type of "host" system which takes natural
language as input.
     Our method is to transform the input until a pattern is
obtained which completely or partially matches a more abstract
stored pattern. This strategy has proved adequate for our purposes a
satisfactory percentage of the time. The power of this method for
natural language dialogues lies in its ability to ignore as
irrelevant some of what it recognizes and everything it does not
recognize at all. A linguistic parser doing word-by-word,
parts-of-speech analysis fails when it cannot find one or more of
the input words in its dictionary. A system that must know every
word is too fragile for unrestricted dialogues.
     In early versions of the paranoid model, such as PARRY1, some
of the pattern recognition mechanisms allowed the elements of the
pattern to be order-independent (Colby, Weber, and Hilf, 1971). For
example, consider the following expressions:
     (1) WHERE DO YOU WORK?
     (2) WHAT SORT OF WORK DO YOU DO?
     (3) WHAT IS YOUR OCCUPATION?
     (4) WHAT DO YOU DO FOR A LIVING?
     (5) WHERE ARE YOU EMPLOYED?
In PARRY1 a procedure scans these expressions looking for an
information-bearing contentive such as "work", "for a living", etc.
When it finds such a contentive along with "you" or "your" in the
expression, regardless of word order, it responds to the expression
as if it were a question about the nature of one's work. This method
correctly classifies the five sentences above. Unfortunately, it
includes the two examples below in the same category:
     (6) DOES YOUR FATHER'S CAR WORK?
     (7) HOW DID THINGS WORK OUT FOR YOU?
An insensitivity to word order has the advantage that lexical items
representing different parts of speech can represent the same
concept, e.g. the word "work" represents the same concept whether it
is used as a noun or a verb. But a price is paid for this resilience
and elasticity. We find from experience that, since English relies
heavily on word order to convey the meaning of its messages, the
average penalty of misunderstanding (to be distinguished from
ununderstanding) is too great. Hence in PARRY2, as will be described
shortly, all the patterns require a specified word order.
     For high-complexity problems it is helpful to have constraints.
Diagnostic psychiatric interviews (and especially those conducted
over teletypes) have several natural constraints. First, clinicians
are trained to ask certain questions in certain ways. This limits
the number of patterns required to recognize utterances about each
topic. Second, only a few hundred standard topics are brought up by
interviewers, who are, furthermore, trained to use everyday
expressions and especially those used by the patient himself. When
the interview is conducted by teletype, expressions tend to be
shortened since the interviewer tries to increase the information
transmission rate over the slow channel of a teletype. Finally,
teletyped interviews represent written utterances, and utterances
are known to be highly redundant, such that unrecognized words can
be ignored without losing the meaning of the message. Also,
utterances are loaded with idioms, cliches, pat phrases, etc., all
easy prey for a pattern-matching approach. It is time-wasting and
usually futile to try to decode an idiom by analyzing the meanings
of its individual words.
     We now describe the pattern-matching functions of the algorithm
in some detail. (See Fig. 1 for a diagram of the overall flow of
control.)


OVERVIEW

     PARRY2 has two primary modules. The first attempts to RECOGNIZE
the input and the second RESPONDS. This paper is primarily about the
RECOGNIZE module. It functions independently of the RESPOND module
except in the case of pronoun references, which the RESPOND module
provides to the RECOGNIZER on request.
     The recognition module has four main steps:
     1) Identify the words in the question and convert them to
        internal synonyms.
     2) Break the input into segments at certain bracketing words.
     3) Match each segment (independently) to a stored pattern.
     4) Match the resulting list of recognized segments to a stored
        compound pattern.
Each of these steps, except the segmenting, throws away what it
cannot identify. Occasionally a reference to an unknown topic is
mis-recognized as some familiar topic.
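
     The following sketch shows how these four steps might fit
together. It is a minimal, hypothetical reconstruction in Python
(the model itself is written in MLISP); the word lists, pattern
tables, and function name are ours, chosen only to illustrate the
flow of control:

    # Hypothetical sketch of the four recognition steps; every name
    # and table below is illustrative, not taken from PARRY2 itself.
    DICTIONARY = {"WHY": "WHY", "ARE": "BE", "YOU": "YOU", "IN": "IN",
                  "THE": None, "HOSPITAL": "HOSPITAL"}
    BRACKET_WORDS = {"WHY", "IN"}             # prepositions, wh-forms
    SIMPLE_PATTERNS = {("WHY", "BE", "YOU"): "ASK-WHY",
                       ("IN", "HOSPITAL"): "TOPIC-HOSPITAL"}
    COMPOUND_PATTERNS = {("ASK-WHY", "TOPIC-HOSPITAL"): "WHY-IN-HOSPITAL"}

    def recognize(text):
        # Step 1: convert words to internal synonyms, dropping
        # anything the dictionary does not know.
        words = [DICTIONARY[w] for w in text.upper().rstrip("?.").split()
                 if DICTIONARY.get(w)]
        # Step 2: break the input into segments at bracketing words.
        segments, current = [], []
        for w in words:
            if w in BRACKET_WORDS and current:
                segments.append(tuple(current))
                current = []
            current.append(w)
        if current:
            segments.append(tuple(current))
        # Step 3: match each segment independently; unmatched
        # segments are thrown away.
        names = tuple(n for n in (SIMPLE_PATTERNS.get(s) for s in segments)
                      if n)
        # Step 4: match the whole list against a compound pattern.
        return COMPOUND_PATTERNS.get(names, "DEFAULT")

    print(recognize("WHY ARE YOU IN THE HOSPITAL?"))  # WHY-IN-HOSPITAL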
15700
15800 PREPROCESSING
15900
     Each word in the input expression is first looked up in a
dictionary of (currently) about 1900 entries which, for the sake of
speed, is maintained in core during run-time. Illustrative examples
of dictionary entries are given in Appendix 2. The dictionary, which
was built empirically from thousands of teletyped interviews with
previous versions of the model, consists of words, groups of words,
and the names of word-classes they can be translated into. The size
of the dictionary is determined by the patterns, i.e. a word is not
included unless it appears in one or more patterns. Entries in the
dictionary reflect PARRY2's main interests. If a word in the input
is not in the dictionary, it is checked to see if it ends with one
of the common suffixes given in Appendix 2.5. If it does, the suffix
is removed and the remaining word is looked up again. If it is still
not in the dictionary, it is dropped from the pattern being formed.
Thus if the input is:
          WHAT IS YOUR CURRENT OCCUPATION?
and the word "current" is not in the dictionary, the pattern at this
stage becomes:
          ( WHAT IS YOUR OCCUPATION )
The question mark is thrown away as redundant since questions are
recognized by word order. (A statement followed by a question mark
(YOU GAMBLE?) is responded to in the same way as that statement
followed by a period.) Synonymic translations of words are made so
that the pattern becomes, for example:
          ( WHAT BE YOU JOB )
     Some groups of words (i.e. idioms) are translated as a group so
that, for example, "for a living" becomes "for job". Certain other
juxtaposed words are contracted into a single word, e.g. "place of
birth" becomes "birthplace". This is done to deal with groups of
words which are represented as a single element in the stored
pattern, thereby preventing segmentation from occurring at the wrong
places, such as at a preposition inside an idiom or phrase. Besides
these contractions, certain expansions are made so that, for
example, "DON'T" becomes "DO NOT" and "I'D" becomes "I WOULD".
     Misspellings can be the bane of teletyped interviews for an
algorithm. Here they are handled in two ways. First, common
misspellings of important words are simply put in the dictionary.
Thus "yuu" is known to mean "you". The apostrophe is often omitted
from contractions, so most contractions are recognized with or
without it. These common misspellings were gathered from over 4000
interviews with earlier versions of the paranoid model.
     Second, five common forms of typing error are checked
systematically. These are:
     1) Doubled letter
     2) Extraneous letter
     3) Forgetting to hold the "shift key" for an apostrophe
     4) Hitting a nearby key on the keyboard
     5) Transposing two letters in a word
The first three errors can be corrected by deleting the offending
character from the word. This is accomplished by deleting each
character in turn until the word is recognized. The fourth type of
error is checked for only eight of the more common near misses.
These were also empirically determined and involve the letter pairs
(T Y), (Q W), (Y U), (I O), (G H), (O P), (A S), and (N M). These
methods are all based on typing errors, but they also correct some
legitimate English spelling errors. Two-letter transposition
corrects, for example, "beleive" to "believe".
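
     The five checks lend themselves to a simple generate-and-test
loop. The sketch below is hypothetical Python; KNOWN stands in for
the dictionary test and NEARBY for the eight empirical key pairs:

    # Try each class of typing error until a known word is produced.
    KNOWN = {"BELIEVE", "WORK", "YOU"}
    NEARBY = [("T", "Y"), ("Q", "W"), ("Y", "U"), ("I", "O"),
              ("G", "H"), ("O", "P"), ("A", "S"), ("N", "M")]

    def correct(word):
        if word in KNOWN:
            return word
        # Errors 1-3 (doubled letter, extraneous letter, unshifted
        # apostrophe): delete each character in turn.
        for i in range(len(word)):
            w = word[:i] + word[i + 1:]
            if w in KNOWN:
                return w
        # Error 4: substitute a nearby key, for the eight pairs only.
        for i, c in enumerate(word):
            for a, b in NEARBY:
                if c in (a, b):
                    w = word[:i] + (b if c == a else a) + word[i + 1:]
                    if w in KNOWN:
                        return w
        # Error 5: transpose two adjacent letters.
        for i in range(len(word) - 1):
            w = word[:i] + word[i + 1] + word[i] + word[i + 2:]
            if w in KNOWN:
                return w
        return None   # still unknown; the word will be dropped

    print(correct("WORRK"))     # doubled letter   -> WORK
    print(correct("WPRK"))      # nearby key (O P) -> WORK
    print(correct("BELEIVE"))   # transposition    -> BELIEVE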

SEGMENTING

     Another weakness in the crude pattern matching of PARRY1 is
that it takes the entire input expression as its basic processing
unit. If only two words are recognized in an eight-word utterance,
the risk of misunderstanding is great. We need a way of dealing with
units shorter than the entire input expression.
     Aided by a heuristic from work in machine translation (Wilks,
1973), we devised a way of bracketing the pattern constructed up to
this point into shorter segments, using prepositions, wh-forms,
certain verbs, etc. as bracketing points. (A list of the bracketing
terms appears in Fig. 2.) These points tend to separate
prepositional phrases and embedded clauses from the main clause.
The new pattern formed is termed either "simple", having no
delimiters within it, or "compound", i.e., being made up of two or
more simple patterns. A simple pattern might be:
          ( WHAT BE YOU JOB )
whereas a compound pattern would be:
          (( WHY BE YOU ) ( IN HOSPITAL ))
Our experience with this method of segmentation shows that compound
patterns from teletyped psychiatric dialogues rarely consist of more
than three or four segments.
     After certain verbs (see Fig. 4) a bracketing occurs to replace
the commonly omitted "THAT", such that:
          ( I THINK YOU BE AFRAID )
becomes
          (( I THINK ) ( YOU BE AFRAID ))
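
     Segmentation can be sketched as a single pass which opens a new
segment at each bracketing word and after each "THAT"-dropping verb.
The Python below is hypothetical; BRACKET and THAT_VERBS are toy
stand-ins for the lists of Fig. 2 and Fig. 4:

    BRACKET = {"WHAT", "WHY", "HOW", "WHO", "IN", "AT", "FOR", "TO"}
    THAT_VERBS = {"THINK", "BELIEVE", "FEEL"}

    def segment(pattern):
        segments, current = [], []
        for word in pattern:
            # Break before a bracketing word, or right after a verb
            # that commonly drops a following "THAT".
            if current and (word in BRACKET or current[-1] in THAT_VERBS):
                segments.append(current)
                current = []
            current.append(word)
        if current:
            segments.append(current)
        return segments

    print(segment(["WHY", "BE", "YOU", "IN", "HOSPITAL"]))
    # [['WHY', 'BE', 'YOU'], ['IN', 'HOSPITAL']]
    print(segment(["I", "THINK", "YOU", "BE", "AFRAID"]))
    # [['I', 'THINK'], ['YOU', 'BE', 'AFRAID']]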

MATCHING INDIVIDUAL SEGMENTS

     Conjunctions serve only as markers for the segmenter; they are
dropped after segmentation.
     Negations are handled by extracting the "NOT" from the segment
and assigning a value to a global variable which indicates that the
expression is negative in form. When a pattern is finally matched,
this variable is consulted. Some patterns have a pointer to a
pattern of opposite meaning if a "NOT" could reverse their meanings.
If this pointer is present and a "NOT" was found, then the pattern
matched is replaced by its opposite, e.g. ( I NOT TRUST YOU ) is
replaced by the pattern ( I MISTRUST YOU ). We have not yet observed
the troublesome case of "he gave me not one but two messages".
(There is no need to scratch where it doesn't itch.)
     Substitutions are also made in certain cases. Some segments
contain pronouns which could stand for a number of different things
of importance to PARRY2. As we mentioned in the introduction, the
response functions of memory keep track of the context in order to
give pronouns and other anaphora a correct interpretation. For
example, the segment:
          ( DO YOU AVOID THEM )
could refer to the Mafia, or racetracks, or other patients,
depending on the context. When such a segment is encountered, the
pronoun is replaced by its current anaphoric value as determined by
the response functions, and a more specific segment such as:
          ( DO YOU AVOID MAFIA )
is looked up.
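
     A sketch of the substitution step (hypothetical Python; the
RESPOND module's context tracking is reduced to a single referent
passed in by the caller):

    def substitute_anaphora(segment, referent):
        # Replace pronouns by their current anaphoric value, as
        # supplied by the response functions, so that a more
        # specific segment can be looked up.
        pronouns = {"THEM", "THEY", "IT", "HE", "SHE"}
        return tuple(referent if w in pronouns else w for w in segment)

    print(substitute_anaphora(("DO", "YOU", "AVOID", "THEM"), "MAFIA"))
    # ('DO', 'YOU', 'AVOID', 'MAFIA')
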
     Other utterances, such as "Why did you do that?" or just "Why?"
(which might be regarded as a massive ellipsis), clearly refer back
to previous utterances. These utterances match very general patterns
which identify the type of question without indicating the exact
topic. The response function which responds to "Why?" consults the
context to produce an appropriate answer.
     The algorithm next attempts to match the segments with stored
simple patterns, which currently number about 1700. (Examples of
simple patterns appear in Appendix 3.) First a complete and perfect
match is sought. When a match is found, the stored pattern name has
a pointer to the name of a response function in memory which decides
what to do further. If a match is not found, further transformations
of the segment are carried out and a "fuzzy" match is tried.
     For fuzzy matching at this stage, we adopted the heuristic rule
of dropping elements in the segment one at a time and attempting a
match each time. This heuristic allows familiar words to be ignored
in unfamiliar contexts. For example, "well" is important in "Are you
well?" but meaningless in "Well are you?".
     Deleting one element at a time results in, for example, the
pattern:
          ( WHAT BE YOU MAIN PROBLEM )
becoming successively:
          (a) ( BE YOU MAIN PROBLEM )
          (b) ( WHAT YOU MAIN PROBLEM )
          (c) ( WHAT BE MAIN PROBLEM )
          (d) ( WHAT BE YOU PROBLEM )
          (e) ( WHAT BE YOU MAIN )
Since the stored pattern in this case matches (d), (e) would not be
constructed. We found it unwise to delete more than one element,
since our segmentation method usually yields segments containing a
small number (1-4) of words.
     Dropping an element at a time provides a probability threshold
for fuzzy matching which is a function of the length of the segment.
If a segment consists of five elements, four of the five must be
present in a particular order (with the fifth element missing in any
position) for a match to occur. If a segment contains four elements,
three must match - and so forth.
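
     The fuzzy match is thus a perfect lookup followed by at most
one deletion per position. In hypothetical Python:

    def fuzzy_match(segment, patterns):
        if tuple(segment) in patterns:          # perfect match first
            return patterns[tuple(segment)]
        for i in range(len(segment)):           # then drop one element
            reduced = tuple(segment[:i] + segment[i + 1:])
            if reduced in patterns:
                return patterns[reduced]
        return None                             # no match at this stage

    PATTERNS = {("WHAT", "BE", "YOU", "PROBLEM"): "ASK-PROBLEM"}
    print(fuzzy_match(["WHAT", "BE", "YOU", "MAIN", "PROBLEM"], PATTERNS))
    # deletion (d) above matches -> ASK-PROBLEM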

COMPOUND-PATTERN MATCH

     When more than one simple pattern is detected in the input, a
second matching is attempted against about 500 compound patterns.
Certain patterns, such as ( HELLO ) and ( I THINK ), are dropped
because they are considered meaningless. If a complete match is not
found, then simple patterns are dropped, one at a time, from the
compound pattern. This allows the input,
          (( HOW DO YOU COME ) ( TO BE ) ( IN HOSPITAL ))
to match the stored pattern,
          (( HOW DO YOU COME ) ( IN HOSPITAL )).
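
     The compound match can be sketched the same way as the fuzzy
match of single segments: discard the meaningless patterns, try a
complete match, then drop one simple pattern at a time (hypothetical
Python):

    MEANINGLESS = {("HELLO",), ("I", "THINK")}

    def match_compound(segments, compound_patterns):
        segments = [s for s in segments if s not in MEANINGLESS]
        if tuple(segments) in compound_patterns:
            return compound_patterns[tuple(segments)]
        for i in range(len(segments)):      # drop one simple pattern
            reduced = tuple(segments[:i] + segments[i + 1:])
            if reduced in compound_patterns:
                return compound_patterns[reduced]
        return None     # default condition; response functions decide

    STORED = {(("HOW", "DO", "YOU", "COME"),
               ("IN", "HOSPITAL")): "WHY-HOSPITAL"}
    print(match_compound([("HOW", "DO", "YOU", "COME"), ("TO", "BE"),
                          ("IN", "HOSPITAL")], STORED))
    # dropping ( TO BE ) -> WHY-HOSPITAL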

     If no match can be found at this point, the algorithm has
arrived at a default condition and the appropriate response
functions decide what to do. For example, in a default condition,
the model may assume control of the interview, asking the
interviewer a question, continuing with the topic under discussion,
or introducing a new topic.
     An annotated example of a diagnostic psychiatric interview is
presented in Appendix 1.


ADVANTAGES AND LIMITATIONS

     As mentioned, one of the main advantages of a pattern-matching
strategy is that it can ignore as irrelevant both some of what it
recognizes and what it does not recognize at all. There are several
million words in English, each possessing from one to over a hundred
senses. To construct a machine-usable word dictionary of this
magnitude is out of the question at this time. Recognition of
natural language input in the manner described above allows
real-time interaction in a dialogue since it avoids becoming
ensnarled in combinatorial disambiguations and long chains of
inferencing which would slow a dialogue algorithm down to
impracticality, if it could even function at all. The price paid for
pattern-matching is that sometimes, but rarely, ambiguities slip
through.
     Another advantage of this method is its speed. The algorithm
consists of about 28K of programs written in MLISP, 16K of data in
LISP, and 16K of data in machine language, with several overlays.
The complete language recognition process requires less than one
second of real time on a time-shared DEC PDP-10.
     A drawback to PARRY1 is that it reacts to the first pattern it
finds in the input rather than characterizing the input as fully as
possible and then deciding what to do based on a number of tests.
Another practical difficulty with PARRY1, from a programmer's
viewpoint, is that, since it is a procedural model, elements of the
patterns are strung out in various procedures throughout the
algorithm. It is often a considerable chore for the programmer to
determine whether a given pattern is present and precisely where it
is. In PARRY2 the patterns are all collected in one part of the
data-base, where they can easily be examined.
     Concentrating all the patterns in the data-base gives PARRY2 a
limited "learning" ability. When an input fails to match any stored
pattern, or matches an incorrect one as judged by a human operator,
a pattern which matches the input can be put into the data-base
automatically. If the new pattern has the same meaning as a
previously stored pattern, the human operator must provide the name
of the appropriate response function. If he doesn't remember the
name, he may try to rephrase the input in a form recognizable to
PARRY2, and it will name the response function associated with the
rephrasing. These mechanisms are not "learning" in the commonly used
sense, but they do allow a person to transfer his knowledge into
PARRY2's data-base with very little effort.
     Informal observation thus far shows PARRY2's linguistic
recognition abilities to be quite superior to PARRY1's. A more
systematic and quantitative evaluation of performance is now being
carried out. PARRY1 was extensively tested by having judges make
ratings of its performance along several dimensions, one of which
was linguistic noncomprehension (Colby and Hilf, 1974). These judges
also made ratings of teletyped interviews with psychiatric patients
and with a random version of PARRY1. The mean ratings of PARRY1
along the dimension of linguistic noncomprehension were better than
those received by RANDOM-PARRY but were three times worse than the
mean ratings received by patients. Once the ratings of PARRY2 along
this dimension are completed, we will be able to compare them with
those of PARRY1 and the patients and obtain a more objective measure
of improvement.


REFERENCES

Colby, K.M., Weber, S., and Hilf, F.D. (1971). Artificial Paranoia.
     ARTIFICIAL INTELLIGENCE, 2, 1-25.

Colby, K.M. and Hilf, F.D. (1974). Multidimensional Evaluation of a
     Computer Simulation of Paranoid Thought. To appear in KNOWLEDGE
     AND COGNITION, L. Gregg (Ed.).

Wilks, Y. (1973). An Artificial Intelligence Approach to Machine
     Translation. In COMPUTER MODELS OF THOUGHT AND LANGUAGE,
     R.C. Schank and K.M. Colby (Eds.), W.H. Freeman, San Francisco.

Winograd, T. (1972). Understanding Natural Language. COGNITIVE
     PSYCHOLOGY, 3, 1-191.

Woods, W.A. (1970). Transition Network Grammars for Natural Language
     Analysis. COMMUNICATIONS OF THE ACM, 13, 591-606.